
Conversation

@wangxiyuan (Collaborator) commented Nov 24, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

gemini-code-assist[bot]

This comment was marked as spam.

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@wangxiyuan wangxiyuan force-pushed the 4142 branch 2 times, most recently from b4ebebe to cb01cd8 on November 24, 2025 at 02:20
@wangxiyuan wangxiyuan changed the title from "Upgrade to v0.11.2" to "Upgrade vLLM to v0.11.2" Nov 24, 2025
@wangxiyuan wangxiyuan mentioned this pull request Nov 24, 2025
@github-actions github-actions bot added documentation Improvements or additions to documentation module:tests module:core labels Nov 24, 2025
leo-pony and others added 20 commits November 24, 2025 10:58
…tructured outputs compatibility#26866

Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: 22dimensions <[email protected]>
Signed-off-by: 22dimensions <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: 22dimensions <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: leo-pony <[email protected]>
Signed-off-by: shen-shanshan <[email protected]>
Signed-off-by: wangxiyuan <[email protected]>
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@zhangxinyuehfad (Contributor) commented Nov 24, 2025

@leo-pony
The Multi-Node-Ray test failed. Log:

(EngineCore_DP0 pid=300679) (RayWorkerWrapper pid=300872) INFO 11-24 08:50:32 [__init__.py:106] Registered model loader `<class 'vllm_ascend.model_loader.netloader.netloader.ModelNetLoaderElastic'>` with load format `netloader`
(EngineCore_DP0 pid=300679) (RayWorkerWrapper pid=300872) WARNING 11-24 08:50:33 [worker_base.py:301] Missing `shared_worker_lock` argument from executor. This argument is needed for mm_processor_cache_type='shm'.
(EngineCore_DP0 pid=300679) (RayWorkerWrapper pid=300872) INFO 11-24 08:50:33 [utils.py:973] FLASHCOMM2 not enable.
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842] EngineCore failed to start.
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842] Traceback (most recent call last):
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 833, in run_engine_core
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 606, in __init__
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     super().__init__(
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 102, in __init__
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 101, in __init__
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self._init_executor()
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/ray_executor.py", line 97, in _init_executor
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self._init_workers_ray(placement_group)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/ray_executor.py", line 370, in _init_workers_ray
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self.collective_rpc("init_device")
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/ray_executor.py", line 493, in collective_rpc
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     return ray.get(ray_worker_outputs, timeout=timeout)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/usr/local/python3.11.13/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     return fn(*args, **kwargs)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/usr/local/python3.11.13/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     return func(*args, **kwargs)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/usr/local/python3.11.13/lib/python3.11/site-packages/ray/_private/worker.py", line 2858, in get
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/usr/local/python3.11.13/lib/python3.11/site-packages/ray/_private/worker.py", line 958, in get_objects
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     raise value.as_instanceof_cause()
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842] ray.exceptions.RayTaskError(AssertionError): ray::RayWorkerWrapper.execute_method() (pid=300878, ip=172.22.0.188, actor_id=ccad69f02f06cafa8981145201000000, repr=<vllm.v1.executor.ray_utils.RayWorkerWrapper object at 0xffcfbc328810>)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 343, in execute_method
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     raise e
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 332, in execute_method
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     return run_method(self, method, args, kwargs)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/serial_utils.py", line 479, in run_method
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     return func(*args, **kwargs)
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/worker/worker_base.py", line 324, in init_device
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self.worker.init_device()  # type: ignore
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 236, in init_device
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     self.device = self._init_device()
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]                   ^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]   File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 220, in _init_device
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]     assert self.parallel_config.local_world_size <= visible_device_count, (
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=300679) ERROR 11-24 08:50:34 [core.py:842] AssertionError: local_world_size (32) must be less than or equal to the number of visible devices (16).
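
The assertion at the end of the log is a per-node capacity check: each node can only host as many workers as it has visible NPUs, and here 32 workers were scheduled onto a node exposing 16 devices. A minimal sketch of that check (the function name and signature are hypothetical; only the assertion message is taken from the traceback):

```python
def check_visible_devices(local_world_size: int, visible_device_count: int) -> None:
    """Hypothetical stand-in for the check in worker_v1.py's _init_device.

    Fails exactly as in the CI log when more workers are placed on a node
    than it has visible devices (e.g. 32 workers vs. 16 NPUs).
    """
    assert local_world_size <= visible_device_count, (
        f"local_world_size ({local_world_size}) must be less than or equal "
        f"to the number of visible devices ({visible_device_count})."
    )
```

Under this reading, the Ray placement group is packing two nodes' worth of workers onto one node, rather than the device count itself being wrong.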

@zhangxinyuehfad (Contributor) commented Nov 24, 2025

@wangxiyuan @MengqingCao
The Multi-Node-DP test failed. Log:

INFO 11-24 09:11:25 [__init__.py:217] Platform plugin ascend is activated
Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)INFO 11-24 09:11:26 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 11-24 09:11:26 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)INFO 11-24 09:11:28 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 11-24 09:11:28 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)Error.  nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)INFO 11-24 09:11:29 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 11-24 09:11:29 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
(Worker_DP0_TP1_EP1 pid=322215) INFO 11-24 09:11:30 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP0_EP0 pid=322214) INFO 11-24 09:11:30 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP6_EP6 pid=322220) INFO 11-24 09:11:31 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP7_EP7 pid=322221) INFO 11-24 09:11:31 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP4_EP4 pid=322218) INFO 11-24 09:11:33 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP5_EP5 pid=322219) INFO 11-24 09:11:33 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP3_EP3 pid=322217) INFO 11-24 09:11:34 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(Worker_DP0_TP2_EP2 pid=322216) INFO 11-24 09:11:35 [model_runner_v1.py:3746] Loading model weights took 29.0584 GB
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842] EngineCore failed to start.
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842] Traceback (most recent call last):
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 829, in run_engine_core
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 1124, in __init__
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     super().__init__(
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 606, in __init__
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     super().__init__(
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 109, in __init__
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     num_gpu_blocks, num_cpu_blocks, kv_cache_config = self._initialize_kv_caches(
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/engine/core.py", line 215, in _initialize_kv_caches
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     kv_cache_specs = self.model_executor.get_kv_cache_specs()
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/abstract.py", line 129, in get_kv_cache_specs
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     return self.collective_rpc("get_kv_cache_spec")
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]   File "/vllm-workspace/vllm/vllm/v1/executor/multiproc_executor.py", line 354, in collective_rpc
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]     while self.futures_queue:
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842]           ^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=321405) ERROR 11-24 09:11:35 [core.py:842] AttributeError: 'AscendMultiprocExecutor' object has no attribute 'futures_queue'
(ApiServer_1 pid=321407) Process ApiServer_1:
(ApiServer_1 pid=321407) Traceback (most recent call last):
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
(ApiServer_1 pid=321407)     self.run()
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 108, in run
(ApiServer_1 pid=321407)     self._target(*self._args, **self._kwargs)
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/entrypoints/cli/serve.py", line 247, in run_api_server_worker_proc
(ApiServer_1 pid=321407)     uvloop.run(
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 92, in run
(ApiServer_1 pid=321407)     return runner.run(wrapper())
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/asyncio/runners.py", line 118, in run
(ApiServer_1 pid=321407)     return self._loop.run_until_complete(task)
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 48, in wrapper
(ApiServer_1 pid=321407)     return await main
(ApiServer_1 pid=321407)            ^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 2043, in run_server_worker
(ApiServer_1 pid=321407)     async with build_async_engine_client(
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__
(ApiServer_1 pid=321407)     return await anext(self.gen)
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 195, in build_async_engine_client
(ApiServer_1 pid=321407)     async with build_async_engine_client_from_engine_args(
(ApiServer_1 pid=321407)   File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__
(ApiServer_1 pid=321407)     return await anext(self.gen)
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 236, in build_async_engine_client_from_engine_args
(ApiServer_1 pid=321407)     async_llm = AsyncLLM.from_vllm_config(
(ApiServer_1 pid=321407)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/utils/func_utils.py", line 116, in inner
(ApiServer_1 pid=321407)     return fn(*args, **kwargs)
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 203, in from_vllm_config
(ApiServer_1 pid=321407)     return cls(
(ApiServer_1 pid=321407)            ^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 133, in __init__
(ApiServer_1 pid=321407)     self.engine_core = EngineCoreClient.make_async_mp_client(
(ApiServer_1 pid=321407)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 120, in make_async_mp_client
(ApiServer_1 pid=321407)     return DPLBAsyncMPClient(*client_args)
(ApiServer_1 pid=321407)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 1176, in __init__
(ApiServer_1 pid=321407)     super().__init__(
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 1017, in __init__
(ApiServer_1 pid=321407)     super().__init__(
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 808, in __init__
(ApiServer_1 pid=321407)     super().__init__(
(ApiServer_1 pid=321407)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 523, in __init__
(ApiServer_1 pid=321407)     raise TimeoutError(
(ApiServer_1 pid=321407) TimeoutError: Timed out waiting for engines to sendinitial message on input socket.
(ApiServer_0 pid=321406) Process ApiServer_0:
(ApiServer_0 pid=321406) Traceback (most recent call last):
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
(ApiServer_0 pid=321406)     self.run()
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/multiprocessing/process.py", line 108, in run
(ApiServer_0 pid=321406)     self._target(*self._args, **self._kwargs)
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/entrypoints/cli/serve.py", line 247, in run_api_server_worker_proc
(ApiServer_0 pid=321406)     uvloop.run(
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 92, in run
(ApiServer_0 pid=321406)     return runner.run(wrapper())
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/asyncio/runners.py", line 118, in run
(ApiServer_0 pid=321406)     return self._loop.run_until_complete(task)
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/site-packages/uvloop/__init__.py", line 48, in wrapper
(ApiServer_0 pid=321406)     return await main
(ApiServer_0 pid=321406)            ^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 2043, in run_server_worker
(ApiServer_0 pid=321406)     async with build_async_engine_client(
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__
(ApiServer_0 pid=321406)     return await anext(self.gen)
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 195, in build_async_engine_client
(ApiServer_0 pid=321406)     async with build_async_engine_client_from_engine_args(
(ApiServer_0 pid=321406)   File "/usr/local/python3.11.13/lib/python3.11/contextlib.py", line 210, in __aenter__
(ApiServer_0 pid=321406)     return await anext(self.gen)
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/entrypoints/openai/api_server.py", line 236, in build_async_engine_client_from_engine_args
(ApiServer_0 pid=321406)     async_llm = AsyncLLM.from_vllm_config(
(ApiServer_0 pid=321406)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/utils/func_utils.py", line 116, in inner
(ApiServer_0 pid=321406)     return fn(*args, **kwargs)
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 203, in from_vllm_config
(ApiServer_0 pid=321406)     return cls(
(ApiServer_0 pid=321406)            ^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/async_llm.py", line 133, in __init__
(ApiServer_0 pid=321406)     self.engine_core = EngineCoreClient.make_async_mp_client(
(ApiServer_0 pid=321406)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 120, in make_async_mp_client
(ApiServer_0 pid=321406)     return DPLBAsyncMPClient(*client_args)
(ApiServer_0 pid=321406)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 1176, in __init__
(ApiServer_0 pid=321406)     super().__init__(
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 1017, in __init__
(ApiServer_0 pid=321406)     super().__init__(
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 808, in __init__
(ApiServer_0 pid=321406)     super().__init__(
(ApiServer_0 pid=321406)   File "/vllm-workspace/vllm/vllm/v1/engine/core_client.py", line 523, in __init__
(ApiServer_0 pid=321406)     raise TimeoutError(
(ApiServer_0 pid=321406) TimeoutError: Timed out waiting for engines to sendinitial message on input socket.
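
The root failure in the log above is the AttributeError (`'AscendMultiprocExecutor' object has no attribute 'futures_queue'`); the ApiServer timeouts are just the downstream symptom of the engine never coming up. A minimal, hypothetical illustration of this failure mode follows: the class names are borrowed from the traceback, but the bodies are assumptions, not the actual vLLM or vllm-ascend source. The pattern is a plugin subclass whose initialization predates an upstream change and never creates an attribute the new base class relies on:

```python
class MultiprocExecutor:
    """Simplified stand-in for the upstream base executor (hypothetical)."""

    def _init_executor(self) -> None:
        # Attribute newly relied upon by the upstream collective_rpc path.
        self.futures_queue = []

    def collective_rpc(self, method: str):
        # Drain pending futures before dispatching, as in the traceback.
        while self.futures_queue:
            self.futures_queue.pop()
        return method


class AscendMultiprocExecutor(MultiprocExecutor):
    def _init_executor(self) -> None:
        # Plugin override that does its own setup and never calls
        # super()._init_executor(), so futures_queue is never created
        # and the first collective_rpc raises AttributeError.
        pass
```

If this is the cause, the fix would be for the plugin override to call `super()._init_executor()` (or otherwise initialize the new attribute) after the upstream upgrade.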


@wangxiyuan
Collaborator Author

See #4400.

@wangxiyuan wangxiyuan closed this Nov 24, 2025
@wangxiyuan wangxiyuan deleted the 4142 branch December 4, 2025 07:04

5 participants